Overview

Dataset statistics

Number of variables: 13
Number of observations: 2985
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 303.3 KiB
Average record size in memory: 104.0 B
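
The memory figures above can be sanity-checked with simple arithmetic. The sketch below assumes every column is stored as an 8-byte numeric type (for example int64 or float64); the report does not state the dtypes, so this is an assumption, not a fact taken from the data.

```python
# Assumption: each of the 13 numeric columns uses 8 bytes per value.
BYTES_PER_VALUE = 8
n_variables = 13
n_observations = 2985

record_bytes = n_variables * BYTES_PER_VALUE   # bytes per row
data_bytes = record_bytes * n_observations     # column data only, index excluded
data_kib = data_bytes / 1024

print(f"record size: {record_bytes} B")
print(f"column data: {data_kib:.1f} KiB")
```

Under that assumption the record size matches the reported 104.0 B, and the column data comes out slightly below the reported 303.3 KiB; the small remainder is presumably index overhead.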

Variable types

Numeric: 13

Warnings

gross_revenue is highly correlated with qtde_invoices and 1 other field [High correlation]
qtde_invoices is highly correlated with gross_revenue and 2 other fields [High correlation]
qtde_items is highly correlated with gross_revenue and 1 other field [High correlation]
qtde_products is highly correlated with qtde_invoices [High correlation]
avg_ticket is highly correlated with qtde_returns and 1 other field [High correlation]
qtde_returns is highly correlated with avg_ticket and 1 other field [High correlation]
avg_basket_size is highly correlated with avg_ticket and 1 other field [High correlation]
gross_revenue is highly correlated with qtde_invoices and 3 other fields [High correlation]
recency_days is highly correlated with qtde_invoices [High correlation]
qtde_invoices is highly correlated with gross_revenue and 3 other fields [High correlation]
qtde_items is highly correlated with gross_revenue and 3 other fields [High correlation]
qtde_products is highly correlated with gross_revenue and 3 other fields [High correlation]
avg_ticket is highly correlated with avg_unique_basket_size [High correlation]
avg_recency_days is highly correlated with frequency [High correlation]
frequency is highly correlated with avg_recency_days [High correlation]
avg_basket_size is highly correlated with gross_revenue and 1 other field [High correlation]
avg_unique_basket_size is highly correlated with qtde_products and 1 other field [High correlation]
gross_revenue is highly correlated with qtde_invoices and 2 other fields [High correlation]
qtde_invoices is highly correlated with gross_revenue and 2 other fields [High correlation]
qtde_items is highly correlated with gross_revenue and 3 other fields [High correlation]
qtde_products is highly correlated with gross_revenue and 3 other fields [High correlation]
avg_recency_days is highly correlated with frequency [High correlation]
frequency is highly correlated with avg_recency_days [High correlation]
avg_basket_size is highly correlated with qtde_items [High correlation]
avg_unique_basket_size is highly correlated with qtde_products [High correlation]
gross_revenue is highly correlated with avg_ticket and 5 other fields [High correlation]
avg_ticket is highly correlated with gross_revenue and 3 other fields [High correlation]
qtde_items is highly correlated with gross_revenue and 5 other fields [High correlation]
qtde_invoices is highly correlated with gross_revenue and 2 other fields [High correlation]
qtde_returns is highly correlated with gross_revenue and 3 other fields [High correlation]
avg_unique_basket_size is highly correlated with qtde_products [High correlation]
qtde_products is highly correlated with gross_revenue and 3 other fields [High correlation]
avg_basket_size is highly correlated with gross_revenue and 3 other fields [High correlation]
avg_ticket is highly skewed (γ1 = 53.71859111) [Skewed]
qtde_returns is highly skewed (γ1 = 51.85337872) [Skewed]
avg_basket_size is highly skewed (γ1 = 45.92326414) [Skewed]
df_index has unique values [Unique]
customer_id has unique values [Unique]
recency_days has 34 (1.1%) zeros [Zeros]
qtde_returns has 1453 (48.7%) zeros [Zeros]
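
The Zeros and Unique warnings above reduce to simple counts. The following is a minimal sketch of how such checks could be written; the helper names `zero_share` and `is_unique` are hypothetical and are not part of pandas-profiling.

```python
def zero_share(values):
    """Return (count, percent) of entries equal to zero."""
    count = sum(1 for v in values if v == 0)
    return count, 100.0 * count / len(values)


def is_unique(values):
    """True when no value occurs more than once."""
    return len(set(values)) == len(values)


# Re-derive the percentages in the warnings from the reported counts.
n_observations = 2985
for name, zeros in (("recency_days", 34), ("qtde_returns", 1453)):
    print(f"{name}: {100 * zeros / n_observations:.1f}% zeros")
```

The loop at the end only reuses counts stated in the report (34 and 1453 zeros out of 2985 observations).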

Reproduction

Analysis started: 2021-06-13 18:45:24.866468
Analysis finished: 2021-06-13 18:46:04.471824
Duration: 39.61 seconds
Software version: pandas-profiling v3.0.0
Download configuration: config.json
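
A report of this kind is typically produced with a few lines of pandas-profiling. The sketch below is illustrative only: the CSV path, the output file name, and the title are placeholders, and the options actually used for this report are not recoverable from the text above.

```python
def build_report(csv_path, html_path="report.html"):
    """Profile a CSV file and write the HTML report (hypothetical paths)."""
    # Imports are deferred so this module loads even where the libraries are absent.
    import pandas as pd
    from pandas_profiling import ProfileReport

    df = pd.read_csv(csv_path)
    profile = ProfileReport(df, title="Overview")  # title is a placeholder
    profile.to_file(html_path)
    return html_path
```

Calling `build_report("customers.csv")` with a real file would require pandas and pandas-profiling to be installed; the file name here is an assumption.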

Variables

df_index
Real number (ℝ≥0)

UNIQUE

Distinct: 2985
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2375.015075
Minimum: 0
Maximum: 5896
Zeros: 1
Zeros (%): < 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 23.4 KiB

Quantile statistics

Minimum: 0
5-th percentile: 186.2
Q1: 947
median: 2163
Q3: 3617
95-th percentile: 5207.2
Maximum: 5896
Range: 5896
Interquartile range (IQR): 2670

Descriptive statistics

Standard deviation: 1604.366167
Coefficient of variation (CV): 0.6755183087
Kurtosis: -0.9931192818
Mean: 2375.015075
Median Absolute Deviation (MAD): 1302
Skewness: 0.3594335959
Sum: 7089420
Variance: 2573990.798
Monotonicity: Strictly increasing
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
0 | 1 | < 0.1%
2702 | 1 | < 0.1%
641 | 1 | < 0.1%
5236 | 1 | < 0.1%
5058 | 1 | < 0.1%
2694 | 1 | < 0.1%
647 | 1 | < 0.1%
2696 | 1 | < 0.1%
649 | 1 | < 0.1%
651 | 1 | < 0.1%
Other values (2975) | 2975 | 99.7%

Value | Count | Frequency (%)
0 | 1 | < 0.1%
1 | 1 | < 0.1%
2 | 1 | < 0.1%
3 | 1 | < 0.1%
4 | 1 | < 0.1%
5 | 1 | < 0.1%
6 | 1 | < 0.1%
7 | 1 | < 0.1%
8 | 1 | < 0.1%
9 | 1 | < 0.1%

Value | Count | Frequency (%)
5896 | 1 | < 0.1%
5877 | 1 | < 0.1%
5867 | 1 | < 0.1%
5861 | 1 | < 0.1%
5840 | 1 | < 0.1%
5836 | 1 | < 0.1%
5830 | 1 | < 0.1%
5819 | 1 | < 0.1%
5818 | 1 | < 0.1%
5808 | 1 | < 0.1%
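
The Monotonicity field reported for df_index ("Strictly increasing") can be derived by comparing neighbouring values. This is a minimal sketch; the labels returned for the non-strict cases ("Increasing", "Decreasing") are assumptions, since only "Strictly increasing" and "Not monotonic" appear in this report.

```python
def monotonicity(values):
    """Classify a sequence by comparing each value with its successor."""
    pairs = list(zip(values, values[1:]))
    if all(a < b for a, b in pairs):
        return "Strictly increasing"
    if all(a <= b for a, b in pairs):
        return "Increasing"            # assumed label
    if all(a > b for a, b in pairs):
        return "Strictly decreasing"   # assumed label
    if all(a >= b for a, b in pairs):
        return "Decreasing"            # assumed label
    return "Not monotonic"


print(monotonicity([0, 1, 2, 5]))
print(monotonicity([3, 1, 2]))
```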

customer_id
Real number (ℝ≥0)

UNIQUE

Distinct: 2985
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 15270.2201
Minimum: 12347
Maximum: 18287
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 23.4 KiB

Quantile statistics

Minimum: 12347
5-th percentile: 12615.2
Q1: 13792
median: 15223
Q3: 16771
95-th percentile: 17964.8
Maximum: 18287
Range: 5940
Interquartile range (IQR): 2979

Descriptive statistics

Standard deviation: 1721.133195
Coefficient of variation (CV): 0.1127117477
Kurtosis: -1.208464419
Mean: 15270.2201
Median Absolute Deviation (MAD): 1490
Skewness: 0.02916556856
Sum: 45581607
Variance: 2962299.476
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
16384 | 1 | < 0.1%
12987 | 1 | < 0.1%
14984 | 1 | < 0.1%
17033 | 1 | < 0.1%
13704 | 1 | < 0.1%
12939 | 1 | < 0.1%
17037 | 1 | < 0.1%
14125 | 1 | < 0.1%
13363 | 1 | < 0.1%
18164 | 1 | < 0.1%
Other values (2975) | 2975 | 99.7%

Value | Count | Frequency (%)
12347 | 1 | < 0.1%
12348 | 1 | < 0.1%
12352 | 1 | < 0.1%
12356 | 1 | < 0.1%
12358 | 1 | < 0.1%
12359 | 1 | < 0.1%
12360 | 1 | < 0.1%
12362 | 1 | < 0.1%
12364 | 1 | < 0.1%
12370 | 1 | < 0.1%

Value | Count | Frequency (%)
18287 | 1 | < 0.1%
18283 | 1 | < 0.1%
18282 | 1 | < 0.1%
18277 | 1 | < 0.1%
18276 | 1 | < 0.1%
18274 | 1 | < 0.1%
18273 | 1 | < 0.1%
18272 | 1 | < 0.1%
18270 | 1 | < 0.1%
18269 | 1 | < 0.1%
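
Each variable above carries a "Histogram with fixed size bins (bins=50)". A minimal sketch of equal-width binning follows; the function name `fixed_width_histogram` is hypothetical, and the exact edge handling used by the plotting library may differ.

```python
def fixed_width_histogram(values, bins=50):
    """Count values in `bins` equal-width intervals spanning [min, max]."""
    lo, hi = min(values), max(values)
    counts = [0] * bins
    if lo == hi:
        # Degenerate case: all values identical, so use a single bin.
        counts[0] = len(values)
        return counts
    width = (hi - lo) / bins
    for v in values:
        # The maximum lands exactly on the last edge; clamp it into the last bin.
        idx = min(int((v - lo) / width), bins - 1)
        counts[idx] += 1
    return counts


counts = fixed_width_histogram(list(range(100)), bins=50)
print(len(counts), sum(counts))
```

With strongly skewed variables such as those flagged in the warnings, most observations fall into the first few bins, which is why a log scale or clipping is often considered before plotting.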

gross_revenue
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 2979
Distinct (%): 99.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2782.437548
Minimum: 6.2
Maximum: 280206.02
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 23.4 KiB

Quantile statistics

Minimum: 6.2
5-th percentile: 229.62
Q1: 573.22
median: 1098.43
Q3: 2330.92
95-th percentile: 7320.916
Maximum: 280206.02
Range: 280199.82
Interquartile range (IQR): 1757.7

Descriptive statistics

Standard deviation: 10635.51096
Coefficient of variation (CV): 3.822371853
Kurtosis: 348.5532904
Mean: 2782.437548
Median Absolute Deviation (MAD): 682.35
Skewness: 16.62512019
Sum: 8305576.08
Variance: 113114093.5
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
63 | 2 | 0.1%
734.94 | 2 | 0.1%
379.65 | 2 | 0.1%
745.06 | 2 | 0.1%
331 | 2 | 0.1%
731.9 | 2 | 0.1%
26879.04 | 1 | < 0.1%
284.46 | 1 | < 0.1%
610.52 | 1 | < 0.1%
605.12 | 1 | < 0.1%
Other values (2969) | 2969 | 99.5%

Value | Count | Frequency (%)
6.2 | 1 | < 0.1%
6.9 | 1 | < 0.1%
13.3 | 1 | < 0.1%
15 | 1 | < 0.1%
36.56 | 1 | < 0.1%
52 | 1 | < 0.1%
52.2 | 1 | < 0.1%
52.2 | 1 | < 0.1%
62.43 | 1 | < 0.1%
63 | 2 | 0.1%

Value | Count | Frequency (%)
280206.02 | 1 | < 0.1%
259657.3 | 1 | < 0.1%
194550.79 | 1 | < 0.1%
168472.5 | 1 | < 0.1%
143825.06 | 1 | < 0.1%
124914.53 | 1 | < 0.1%
117379.63 | 1 | < 0.1%
91062.38 | 1 | < 0.1%
81024.84 | 1 | < 0.1%
66653.56 | 1 | < 0.1%
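
Several of the descriptive statistics reported for gross_revenue are derived from others, which gives a quick consistency check. The sketch below only reuses numbers from the report; the tolerance is an arbitrary choice to absorb rounding in the printed values.

```python
import math

# Values copied from the gross_revenue statistics above.
mean, std = 2782.437548, 10635.51096
q1, q3 = 573.22, 2330.92
minimum, maximum = 6.2, 280206.02

cv = std / mean             # coefficient of variation
variance = std ** 2         # variance is the squared standard deviation
iqr = q3 - q1               # interquartile range
value_range = maximum - minimum

checks = {
    "CV": (cv, 3.822371853),
    "Variance": (variance, 113114093.5),
    "IQR": (iqr, 1757.7),
    "Range": (value_range, 280199.82),
}
for name, (computed, reported) in checks.items():
    ok = math.isclose(computed, reported, rel_tol=1e-6)
    print(f"{name}: computed={computed:.6f} reported={reported} match={ok}")
```

A CV close to 3.8 means the standard deviation is almost four times the mean, which fits the very long right tail visible in the largest values.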

recency_days
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS

Distinct: 272
Distinct (%): 9.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 64.47303183
Minimum: 0
Maximum: 373
Zeros: 34
Zeros (%): 1.1%
Negative: 0
Negative (%): 0.0%
Memory size: 23.4 KiB

Quantile statistics

Minimum: 0
5-th percentile: 2
Q1: 11
median: 32
Q3: 82
95-th percentile: 242
Maximum: 373
Range: 373
Interquartile range (IQR): 71

Descriptive statistics

Standard deviation: 77.76614786
Coefficient of variation (CV): 1.206181029
Kurtosis: 2.750197
Mean: 64.47303183
Median Absolute Deviation (MAD): 26
Skewness: 1.790720645
Sum: 192452
Variance: 6047.573753
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
1 | 99 | 3.3%
4 | 87 | 2.9%
3 | 85 | 2.8%
2 | 85 | 2.8%
8 | 76 | 2.5%
10 | 67 | 2.2%
9 | 67 | 2.2%
7 | 66 | 2.2%
17 | 64 | 2.1%
22 | 55 | 1.8%
Other values (262) | 2234 | 74.8%

Value | Count | Frequency (%)
0 | 34 | 1.1%
1 | 99 | 3.3%
2 | 85 | 2.8%
3 | 85 | 2.8%
4 | 87 | 2.9%
5 | 43 | 1.4%
7 | 66 | 2.2%
8 | 76 | 2.5%
9 | 67 | 2.2%
10 | 67 | 2.2%

Value | Count | Frequency (%)
373 | 2 | 0.1%
372 | 4 | 0.1%
371 | 1 | < 0.1%
368 | 1 | < 0.1%
366 | 4 | 0.1%
365 | 3 | 0.1%
364 | 1 | < 0.1%
360 | 1 | < 0.1%
359 | 1 | < 0.1%
358 | 4 | 0.1%
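
The quantile statistics above (5-th percentile, Q1, median, Q3, 95-th percentile) can be computed with linear interpolation between order statistics. This is a minimal sketch; linear interpolation is an assumption about the method used, chosen because it is the common default, and the helper name `percentile` is hypothetical.

```python
def percentile(values, q):
    """q-th percentile (0..100) using linear interpolation between neighbours."""
    data = sorted(values)
    pos = (len(data) - 1) * q / 100      # fractional index into the sorted data
    lower = int(pos)                     # floor, since pos is never negative
    upper = min(lower + 1, len(data) - 1)
    frac = pos - lower
    return data[lower] + (data[upper] - data[lower]) * frac


sample = [0, 2, 11, 32, 82, 242, 373]    # toy values, not the real column
q1, q3 = percentile(sample, 25), percentile(sample, 75)
print("Q1:", q1, "Q3:", q3, "IQR:", q3 - q1)
```

Interpolation explains why a percentile can be fractional even when every observation is an integer.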

qtde_invoices
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 59
Distinct (%): 2.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5.734338358
Minimum: 1
Maximum: 209
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 23.4 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 2
median: 4
Q3: 6
95-th percentile: 17
Maximum: 209
Range: 208
Interquartile range (IQR): 4

Descriptive statistics

Standard deviation: 8.90180138
Coefficient of variation (CV): 1.552367653
Kurtosis: 194.6807795
Mean: 5.734338358
Median Absolute Deviation (MAD): 2
Skewness: 10.86872714
Sum: 17117
Variance: 79.2420678
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
2 | 788 | 26.4%
3 | 505 | 16.9%
4 | 385 | 12.9%
5 | 242 | 8.1%
1 | 193 | 6.5%
6 | 172 | 5.8%
7 | 143 | 4.8%
8 | 98 | 3.3%
9 | 68 | 2.3%
10 | 54 | 1.8%
Other values (49) | 337 | 11.3%

Value | Count | Frequency (%)
1 | 193 | 6.5%
2 | 788 | 26.4%
3 | 505 | 16.9%
4 | 385 | 12.9%
5 | 242 | 8.1%
6 | 172 | 5.8%
7 | 143 | 4.8%
8 | 98 | 3.3%
9 | 68 | 2.3%
10 | 54 | 1.8%

Value | Count | Frequency (%)
209 | 1 | < 0.1%
201 | 1 | < 0.1%
124 | 1 | < 0.1%
97 | 1 | < 0.1%
93 | 1 | < 0.1%
91 | 1 | < 0.1%
86 | 1 | < 0.1%
73 | 1 | < 0.1%
63 | 1 | < 0.1%
62 | 1 | < 0.1%
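
The Median Absolute Deviation (MAD) reported above is a spread measure that, unlike the standard deviation, is barely affected by extreme customers such as the one with 209 invoices. A minimal sketch, using toy data:

```python
from statistics import median


def mad(values):
    """Median of the absolute deviations from the median."""
    centre = median(values)
    return median(abs(v - centre) for v in values)


toy = [1, 2, 3, 4, 100]   # toy data with one extreme value
print("median:", median(toy), "MAD:", mad(toy))
```

In the toy sample the outlier 100 leaves the MAD at 1, whereas it would dominate the standard deviation.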

qtde_items
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 1677
Distinct (%): 56.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1605.125963
Minimum: 1
Maximum: 196915
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 23.4 KiB

Quantile statistics

Minimum: 1
5-th percentile: 101
Q1: 294
median: 634
Q3: 1398
95-th percentile: 4406.4
Maximum: 196915
Range: 196914
Interquartile range (IQR): 1104

Descriptive statistics

Standard deviation: 5878.60124
Coefficient of variation (CV): 3.66239247
Kurtosis: 467.002295
Mean: 1605.125963
Median Absolute Deviation (MAD): 418
Skewness: 17.8666028
Sum: 4791301
Variance: 34557952.54
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
88 | 11 | 0.4%
310 | 10 | 0.3%
300 | 10 | 0.3%
150 | 9 | 0.3%
84 | 8 | 0.3%
272 | 8 | 0.3%
394 | 8 | 0.3%
246 | 7 | 0.2%
264 | 7 | 0.2%
152 | 7 | 0.2%
Other values (1667) | 2900 | 97.2%

Value | Count | Frequency (%)
1 | 1 | < 0.1%
2 | 2 | 0.1%
3 | 1 | < 0.1%
12 | 2 | 0.1%
16 | 1 | < 0.1%
17 | 1 | < 0.1%
18 | 1 | < 0.1%
19 | 1 | < 0.1%
20 | 1 | < 0.1%
23 | 1 | < 0.1%

Value | Count | Frequency (%)
196915 | 1 | < 0.1%
80997 | 1 | < 0.1%
80265 | 1 | < 0.1%
77374 | 1 | < 0.1%
69993 | 1 | < 0.1%
64549 | 1 | < 0.1%
64124 | 1 | < 0.1%
63312 | 1 | < 0.1%
58343 | 1 | < 0.1%
57885 | 1 | < 0.1%

qtde_products
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 460
Distinct (%): 15.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 122.6338358
Minimum: 1
Maximum: 7847
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 23.4 KiB

Quantile statistics

Minimum: 1
5-th percentile: 9
Q1: 29
median: 66
Q3: 135
95-th percentile: 382
Maximum: 7847
Range: 7846
Interquartile range (IQR): 106

Descriptive statistics

Standard deviation: 269.7042745
Coefficient of variation (CV): 2.199264768
Kurtosis: 356.0012832
Mean: 122.6338358
Median Absolute Deviation (MAD): 44
Skewness: 15.73037818
Sum: 366062
Variance: 72740.3957
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
28 | 42 | 1.4%
29 | 38 | 1.3%
19 | 38 | 1.3%
35 | 38 | 1.3%
20 | 36 | 1.2%
11 | 32 | 1.1%
15 | 31 | 1.0%
31 | 31 | 1.0%
26 | 30 | 1.0%
27 | 30 | 1.0%
Other values (450) | 2639 | 88.4%

Value | Count | Frequency (%)
1 | 6 | 0.2%
2 | 14 | 0.5%
3 | 18 | 0.6%
4 | 16 | 0.5%
5 | 28 | 0.9%
6 | 26 | 0.9%
7 | 18 | 0.6%
8 | 22 | 0.7%
9 | 27 | 0.9%
10 | 25 | 0.8%

Value | Count | Frequency (%)
7847 | 1 | < 0.1%
5675 | 1 | < 0.1%
5111 | 1 | < 0.1%
4595 | 1 | < 0.1%
2700 | 1 | < 0.1%
2379 | 1 | < 0.1%
2076 | 1 | < 0.1%
1818 | 1 | < 0.1%
1677 | 1 | < 0.1%
1637 | 1 | < 0.1%

avg_ticket
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
SKEWED

Distinct: 2984
Distinct (%): > 99.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 51.74349696
Minimum: 2.150588235
Maximum: 56157.5
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 23.4 KiB

Quantile statistics

Minimum: 2.150588235
5-th percentile: 4.919743966
Q1: 13.1605
median: 18.18108696
Q3: 25.31482759
95-th percentile: 90.72201312
Maximum: 56157.5
Range: 56155.34941
Interquartile range (IQR): 12.15432759

Descriptive statistics

Standard deviation: 1033.269312
Coefficient of variation (CV): 19.96906612
Kurtosis: 2916.221951
Mean: 51.74349696
Median Absolute Deviation (MAD): 6.130240562
Skewness: 53.71859111
Sum: 154454.3384
Variance: 1067645.471
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
16.83333333 | 2 | 0.1%
17.49275862 | 1 | < 0.1%
9.418292683 | 1 | < 0.1%
5.01173913 | 1 | < 0.1%
18.82290698 | 1 | < 0.1%
28.89968794 | 1 | < 0.1%
46.07413043 | 1 | < 0.1%
25.77538462 | 1 | < 0.1%
8.745172414 | 1 | < 0.1%
18.15061538 | 1 | < 0.1%
Other values (2974) | 2974 | 99.6%

Value | Count | Frequency (%)
2.150588235 | 1 | < 0.1%
2.4325 | 1 | < 0.1%
2.462371134 | 1 | < 0.1%
2.504876033 | 1 | < 0.1%
2.50837156 | 1 | < 0.1%
2.65 | 1 | < 0.1%
2.656931818 | 1 | < 0.1%
2.707598253 | 1 | < 0.1%
2.760621572 | 1 | < 0.1%
2.771005291 | 1 | < 0.1%

Value | Count | Frequency (%)
56157.5 | 1 | < 0.1%
4453.43 | 1 | < 0.1%
2027.86 | 1 | < 0.1%
1687.2 | 1 | < 0.1%
952.9875 | 1 | < 0.1%
931.5 | 1 | < 0.1%
872.13 | 1 | < 0.1%
835.864 | 1 | < 0.1%
643.8585714 | 1 | < 0.1%
640 | 1 | < 0.1%
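
avg_ticket carries the Skewed flag, with a skewness (γ1) above 53. The sketch below computes the bias-adjusted sample skewness; whether the report uses exactly this estimator is an assumption, and the data are a toy sample, not the real column.

```python
import math


def skewness(values):
    """Bias-adjusted sample skewness; needs at least 3 non-identical values."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n   # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n   # third central moment
    g1 = m3 / m2 ** 1.5                             # biased estimate
    return math.sqrt(n * (n - 1)) / (n - 2) * g1    # small-sample adjustment


print(skewness([1, 2, 3, 4, 5]))     # symmetric sample
print(skewness([1, 1, 1, 1, 50]))    # one large value drags the tail right
```

A single extreme observation, like the maximum of 56157.5 against a median near 18, is enough to push the statistic far above zero.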

avg_recency_days
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION

Distinct: 1255
Distinct (%): 42.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 66.90582076
Minimum: 1
Maximum: 366
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 23.4 KiB

Quantile statistics

Minimum: 1
5-th percentile: 7.407171717
Q1: 25.28571429
median: 47.66666667
Q3: 85
95-th percentile: 201
Maximum: 366
Range: 365
Interquartile range (IQR): 59.71428571

Descriptive statistics

Standard deviation: 63.47649879
Coefficient of variation (CV): 0.9487440416
Kurtosis: 4.941588072
Mean: 66.90582076
Median Absolute Deviation (MAD): 26.12121212
Skewness: 2.073697691
Sum: 199713.875
Variance: 4029.265899
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
14 | 24 | 0.8%
4 | 23 | 0.8%
70 | 22 | 0.7%
7 | 20 | 0.7%
1 | 19 | 0.6%
35 | 19 | 0.6%
21 | 18 | 0.6%
11 | 18 | 0.6%
46 | 18 | 0.6%
49 | 18 | 0.6%
Other values (1245) | 2786 | 93.3%

Value | Count | Frequency (%)
1 | 19 | 0.6%
1.5 | 1 | < 0.1%
2 | 14 | 0.5%
2.5 | 1 | < 0.1%
2.565517241 | 1 | < 0.1%
3 | 15 | 0.5%
3.271929825 | 1 | < 0.1%
3.321428571 | 1 | < 0.1%
3.5 | 2 | 0.1%
4 | 23 | 0.8%

Value | Count | Frequency (%)
366 | 1 | < 0.1%
365 | 1 | < 0.1%
363 | 1 | < 0.1%
362 | 1 | < 0.1%
357 | 2 | 0.1%
356 | 1 | < 0.1%
355 | 2 | 0.1%
352 | 1 | < 0.1%
351 | 2 | 0.1%
350 | 3 | 0.1%

frequency
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION

Distinct: 1355
Distinct (%): 45.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.06544749662
Minimum: 0.005449591281
Maximum: 4
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 23.4 KiB

Quantile statistics

Minimum: 0.005449591281
5-th percentile: 0.009529875645
Q1: 0.01792114695
median: 0.0297029703
Q3: 0.05660377358
95-th percentile: 0.2222222222
Maximum: 4
Range: 3.994550409
Interquartile range (IQR): 0.03868262663

Descriptive statistics

Standard deviation: 0.1464278785
Coefficient of variation (CV): 2.237333528
Kurtosis: 207.4452392
Mean: 0.06544749662
Median Absolute Deviation (MAD): 0.01466537631
Skewness: 10.83265063
Sum: 195.3607774
Variance: 0.02144112361
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
0.02777777778 | 21 | 0.7%
0.1666666667 | 21 | 0.7%
0.3333333333 | 20 | 0.7%
0.09090909091 | 18 | 0.6%
1 | 17 | 0.6%
0.4 | 17 | 0.6%
0.0625 | 16 | 0.5%
0.02380952381 | 16 | 0.5%
0.03571428571 | 15 | 0.5%
0.1333333333 | 15 | 0.5%
Other values (1345) | 2809 | 94.1%

Value | Count | Frequency (%)
0.005449591281 | 1 | < 0.1%
0.005464480874 | 1 | < 0.1%
0.005494505495 | 1 | < 0.1%
0.005509641873 | 1 | < 0.1%
0.005586592179 | 2 | 0.1%
0.005602240896 | 1 | < 0.1%
0.005617977528 | 2 | 0.1%
0.00566572238 | 1 | < 0.1%
0.005681818182 | 2 | 0.1%
0.005698005698 | 3 | 0.1%

Value | Count | Frequency (%)
4 | 1 | < 0.1%
2 | 1 | < 0.1%
1.571428571 | 1 | < 0.1%
1.5 | 3 | 0.1%
1 | 17 | 0.6%
0.8333333333 | 1 | < 0.1%
0.75 | 1 | < 0.1%
0.6666666667 | 13 | 0.4%
0.6648793566 | 1 | < 0.1%
0.6 | 1 | < 0.1%
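
frequency and avg_recency_days are flagged as highly correlated with each other. Which coefficient triggers a flag matters: a rank-based measure such as Spearman's captures any monotonic relationship, while Pearson's only captures linear ones. The sketch below uses toy data in which one variable is the reciprocal of the other; that relationship is an illustrative assumption, not something stated in the report.

```python
import math


def ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


days = [1, 2, 4, 8, 16]            # toy gaps between purchases
rate = [1 / d for d in days]       # toy rate, assumed inverse of the gap
print("pearson :", round(pearson(days, rate), 3))
print("spearman:", round(spearman(days, rate), 3))
```

On the toy data the rank correlation is exactly -1 while the linear correlation is noticeably weaker, which is one reason the same pair can be flagged under one coefficient and not another.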

qtde_returns
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
SKEWED
ZEROS

Distinct: 217
Distinct (%): 7.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 63.24020101
Minimum: 0
Maximum: 80995
Zeros: 1453
Zeros (%): 48.7%
Negative: 0
Negative (%): 0.0%
Memory size: 23.4 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 1
Q3: 9
95-th percentile: 103.8
Maximum: 80995
Range: 80995
Interquartile range (IQR): 9

Descriptive statistics

Standard deviation: 1509.254596
Coefficient of variation (CV): 23.86543009
Kurtosis: 2774.251512
Mean: 63.24020101
Median Absolute Deviation (MAD): 1
Skewness: 51.85337872
Sum: 188772
Variance: 2277849.436
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
0 | 1453 | 48.7%
1 | 183 | 6.1%
2 | 153 | 5.1%
3 | 107 | 3.6%
4 | 90 | 3.0%
6 | 72 | 2.4%
5 | 64 | 2.1%
8 | 49 | 1.6%
12 | 48 | 1.6%
7 | 47 | 1.6%
Other values (207) | 719 | 24.1%

Value | Count | Frequency (%)
0 | 1453 | 48.7%
1 | 183 | 6.1%
2 | 153 | 5.1%
3 | 107 | 3.6%
4 | 90 | 3.0%
5 | 64 | 2.1%
6 | 72 | 2.4%
7 | 47 | 1.6%
8 | 49 | 1.6%
9 | 38 | 1.3%

Value | Count | Frequency (%)
80995 | 1 | < 0.1%
9014 | 1 | < 0.1%
8060 | 1 | < 0.1%
4627 | 1 | < 0.1%
3768 | 1 | < 0.1%
3335 | 1 | < 0.1%
2975 | 1 | < 0.1%
2022 | 1 | < 0.1%
2012 | 1 | < 0.1%
1920 | 1 | < 0.1%

avg_basket_size
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
SKEWED

Distinct: 1983
Distinct (%): 66.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 246.3714787
Minimum: 1
Maximum: 40498.5
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 23.4 KiB

Quantile statistics

Minimum: 1
5-th percentile: 44
Q1: 103
median: 172
Q3: 281
95-th percentile: 598.64
Maximum: 40498.5
Range: 40497.5
Interquartile range (IQR): 178

Descriptive statistics

Standard deviation: 782.4497016
Coefficient of variation (CV): 3.175894003
Kurtosis: 2350.161191
Mean: 246.3714787
Median Absolute Deviation (MAD): 82.5
Skewness: 45.92326414
Sum: 735418.8639
Variance: 612227.5356
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
100 | 11 | 0.4%
82 | 10 | 0.3%
114 | 10 | 0.3%
60 | 9 | 0.3%
136 | 9 | 0.3%
73 | 8 | 0.3%
86 | 8 | 0.3%
150 | 7 | 0.2%
64 | 7 | 0.2%
130 | 7 | 0.2%
Other values (1973) | 2899 | 97.1%

Value | Count | Frequency (%)
1 | 2 | 0.1%
1.5 | 1 | < 0.1%
2 | 1 | < 0.1%
3.333333333 | 1 | < 0.1%
5.333333333 | 1 | < 0.1%
5.666666667 | 1 | < 0.1%
6.142857143 | 1 | < 0.1%
7.5 | 1 | < 0.1%
9 | 1 | < 0.1%
9.5 | 1 | < 0.1%

Value | Count | Frequency (%)
40498.5 | 1 | < 0.1%
6009.333333 | 1 | < 0.1%
3684.47619 | 1 | < 0.1%
2880 | 1 | < 0.1%
2697.465753 | 1 | < 0.1%
2183.2 | 1 | < 0.1%
2160.333333 | 1 | < 0.1%
2141.5 | 1 | < 0.1%
2082.225806 | 1 | < 0.1%
2000 | 1 | < 0.1%

avg_unique_basket_size
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 1015
Distinct (%): 34.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 22.02564613
Minimum: 1
Maximum: 300.6470588
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 23.4 KiB

Quantile statistics

Minimum: 1
5-th percentile: 3.478947368
Q1: 10
median: 17.22222222
Q3: 27.71428571
95-th percentile: 56.71515152
Maximum: 300.6470588
Range: 299.6470588
Interquartile range (IQR): 17.71428571

Descriptive statistics

Standard deviation: 18.94048787
Coefficient of variation (CV): 0.8599288192
Kurtosis: 23.48302059
Mean: 22.02564613
Median Absolute Deviation (MAD): 8.222222222
Skewness: 3.169981848
Sum: 65746.55369
Variance: 358.7420806
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
13 | 54 | 1.8%
14 | 44 | 1.5%
11 | 35 | 1.2%
1 | 33 | 1.1%
9 | 32 | 1.1%
20 | 31 | 1.0%
17 | 29 | 1.0%
10 | 29 | 1.0%
6 | 28 | 0.9%
15 | 28 | 0.9%
Other values (1005) | 2642 | 88.5%

Value | Count | Frequency (%)
1 | 33 | 1.1%
1.2 | 1 | < 0.1%
1.25 | 1 | < 0.1%
1.333333333 | 2 | 0.1%
1.5 | 9 | 0.3%
1.555555556 | 1 | < 0.1%
1.571428571 | 1 | < 0.1%
1.666666667 | 4 | 0.1%
1.833333333 | 1 | < 0.1%
1.9 | 1 | < 0.1%

Value | Count | Frequency (%)
300.6470588 | 1 | < 0.1%
203.5 | 1 | < 0.1%
149 | 1 | < 0.1%
145.3333333 | 1 | < 0.1%
136.25 | 1 | < 0.1%
135.75 | 1 | < 0.1%
127 | 1 | < 0.1%
122 | 1 | < 0.1%
118 | 1 | < 0.1%
114 | 1 | < 0.1%

Interactions

[Figures: interaction plots between pairs of variables; the Matplotlib SVG images are not preserved in this text export]
2021-06-13T15:46:00.909822image/svg+xmlMatplotlib v3.4.2, https://matplotlib.org/
2021-06-13T15:46:01.036827image/svg+xmlMatplotlib v3.4.2, https://matplotlib.org/
2021-06-13T15:46:01.157825image/svg+xmlMatplotlib v3.4.2, https://matplotlib.org/
2021-06-13T15:46:01.286824image/svg+xmlMatplotlib v3.4.2, https://matplotlib.org/
2021-06-13T15:46:01.416828image/svg+xmlMatplotlib v3.4.2, https://matplotlib.org/
2021-06-13T15:46:01.545822image/svg+xmlMatplotlib v3.4.2, https://matplotlib.org/
2021-06-13T15:46:01.671825image/svg+xmlMatplotlib v3.4.2, https://matplotlib.org/
2021-06-13T15:46:01.793823image/svg+xmlMatplotlib v3.4.2, https://matplotlib.org/
2021-06-13T15:46:01.904822image/svg+xmlMatplotlib v3.4.2, https://matplotlib.org/
2021-06-13T15:46:02.024856image/svg+xmlMatplotlib v3.4.2, https://matplotlib.org/
2021-06-13T15:46:02.138862image/svg+xmlMatplotlib v3.4.2, https://matplotlib.org/
2021-06-13T15:46:02.269860image/svg+xmlMatplotlib v3.4.2, https://matplotlib.org/
2021-06-13T15:46:02.369858image/svg+xmlMatplotlib v3.4.2, https://matplotlib.org/
2021-06-13T15:46:02.488822image/svg+xmlMatplotlib v3.4.2, https://matplotlib.org/
2021-06-13T15:46:02.607827image/svg+xmlMatplotlib v3.4.2, https://matplotlib.org/
2021-06-13T15:46:02.723825image/svg+xmlMatplotlib v3.4.2, https://matplotlib.org/
2021-06-13T15:46:02.836822image/svg+xmlMatplotlib v3.4.2, https://matplotlib.org/
2021-06-13T15:46:02.952866image/svg+xmlMatplotlib v3.4.2, https://matplotlib.org/
2021-06-13T15:46:03.078862image/svg+xmlMatplotlib v3.4.2, https://matplotlib.org/
2021-06-13T15:46:03.193860image/svg+xmlMatplotlib v3.4.2, https://matplotlib.org/

Correlations


Pearson's r

Pearson's correlation coefficient (r) measures the linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, which implies that, for a linear function, the angle to the x-axis does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
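The calculation described above can be sketched in a few lines of plain Python. This is a minimal illustration, not the code the report itself runs; the function name `pearson_r` and the sample lists are assumptions made for the example.

```python
import math


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Population covariance and standard deviations; the 1/n factors cancel in r.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x) / n)
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y) / n)
    if std_x == 0 or std_y == 0:
        raise ValueError("r is undefined when a variable is constant")
    return cov / (std_x * std_y)


# Illustrative data only (not taken from the profiled dataset).
x = [1.0, 2.0, 3.0, 4.0]
y = [2.1, 3.9, 6.2, 7.8]
r = pearson_r(x, y)

# Shifting and (positively) rescaling x leaves r unchanged.
r_rescaled = pearson_r([3.0 * a + 5.0 for a in x], y)
```

The last two lines illustrate the invariance under changes in location and scale mentioned above: `r` and `r_rescaled` agree up to floating-point error.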

Spearman's ρ

Spearman's rank correlation coefficient (ρ) measures the monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of the standard deviations of those rank variables.
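As a rough sketch of that recipe, the snippet below ranks each variable and then applies the covariance-over-standard-deviations formula to the ranks. The helper names (`average_ranks`, `spearman_rho`) are hypothetical, and giving tied values their average rank is an assumption of this sketch rather than something the report states.

```python
import math


def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Covariance of the rank variables divided by the product of their standard deviations."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x = sum(rx) / n
    mean_y = sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry)) / n
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in rx) / n)
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in ry) / n)
    return cov / (std_x * std_y)


# Illustrative data only: y = x**2 is monotonic but not linear in x.
x = [1, 2, 3, 4, 5]
y = [v ** 2 for v in x]
rho = spearman_rho(x, y)
```

Because the relationship in this toy example is strictly increasing, the ranks of `x` and `y` coincide and ρ reaches its maximum, whereas Pearson's r on the raw values would fall short of 1.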

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
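The pair-counting definition above corresponds to the variant usually called tau-a, and can be sketched directly. The function name `kendall_tau_a` is an assumption for this example; library implementations commonly use a tie-adjusted variant (tau-b), so their results may differ from this sketch when the data contain ties.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    concordant = 0
    discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1  # both variables order the pair the same way
        elif sign < 0:
            discordant += 1  # the variables order the pair in opposite ways
        # Pairs tied in x or y count as neither concordant nor discordant.
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs


# Illustrative data only: two of the three pairs are concordant, one is discordant.
tau = kendall_tau_a([1, 2, 3], [1, 3, 2])
```

This version compares every pair explicitly, so it runs in quadratic time; that is fine for illustration but slow for large datasets.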

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently across categorical, ordinal, and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient when the input follows a bivariate normal distribution. Extensive documentation is available from the Phik project.

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion by eye.

Sample

First rows

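| | df_index | customer_id | gross_revenue | recency_days | qtde_invoices | qtde_items | qtde_products | avg_ticket | avg_recency_days | frequency | qtde_returns | avg_basket_size | avg_unique_basket_size |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 0 | 0 | 17850 | 5391.21 | 372.0 | 34.0 | 1733.0 | 297.0 | 18.152222 | 35.500000 | 0.486111 | 40.0 | 50.970588 | 8.735294 |
| 1 | 1 | 13047 | 3237.54 | 31.0 | 10.0 | 1391.0 | 172.0 | 18.822907 | 26.307692 | 0.052478 | 36.0 | 139.100000 | 17.200000 |
| 2 | 2 | 12583 | 7281.38 | 2.0 | 15.0 | 5060.0 | 247.0 | 29.479271 | 21.823529 | 0.048387 | 51.0 | 337.333333 | 16.466667 |
| 3 | 3 | 13748 | 948.25 | 95.0 | 5.0 | 439.0 | 28.0 | 33.866071 | 92.666667 | 0.017921 | 0.0 | 87.800000 | 5.600000 |
| 4 | 4 | 15100 | 876.00 | 333.0 | 3.0 | 80.0 | 3.0 | 292.000000 | 8.600000 | 0.136364 | 22.0 | 26.666667 | 1.000000 |
| 5 | 5 | 15291 | 4668.30 | 25.0 | 15.0 | 2103.0 | 103.0 | 45.323301 | 21.750000 | 0.057307 | 29.0 | 140.200000 | 6.866667 |
| 6 | 6 | 14688 | 5630.87 | 7.0 | 21.0 | 3621.0 | 327.0 | 17.219786 | 18.300000 | 0.073569 | 399.0 | 172.428571 | 15.571429 |
| 7 | 7 | 17809 | 5411.91 | 16.0 | 12.0 | 2057.0 | 61.0 | 88.719836 | 32.454545 | 0.041899 | 42.0 | 171.416667 | 5.083333 |
| 8 | 8 | 15311 | 60767.90 | 0.0 | 91.0 | 38194.0 | 2379.0 | 25.543464 | 4.144444 | 0.315508 | 474.0 | 419.714286 | 26.142857 |
| 9 | 9 | 14527 | 8508.82 | 2.0 | 55.0 | 2089.0 | 972.0 | 8.753930 | 5.888889 | 0.231183 | 40.0 | 37.981818 | 17.672727 |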

Last rows

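| | df_index | customer_id | gross_revenue | recency_days | qtde_invoices | qtde_items | qtde_products | avg_ticket | avg_recency_days | frequency | qtde_returns | avg_basket_size | avg_unique_basket_size |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 2975 | 5808 | 17727 | 1060.25 | 15.0 | 1.0 | 645.0 | 66.0 | 16.064394 | 6.0 | 0.285714 | 6.0 | 645.000000 | 66.000000 |
| 2976 | 5818 | 17232 | 421.52 | 2.0 | 2.0 | 203.0 | 36.0 | 11.708889 | 12.0 | 0.153846 | 0.0 | 101.500000 | 18.000000 |
| 2977 | 5819 | 17468 | 137.00 | 10.0 | 2.0 | 116.0 | 5.0 | 27.400000 | 4.0 | 0.400000 | 0.0 | 58.000000 | 2.500000 |
| 2978 | 5830 | 13596 | 697.04 | 5.0 | 2.0 | 406.0 | 166.0 | 4.199036 | 7.0 | 0.250000 | 0.0 | 203.000000 | 83.000000 |
| 2979 | 5836 | 14893 | 1237.85 | 9.0 | 2.0 | 799.0 | 73.0 | 16.956849 | 2.0 | 0.666667 | 0.0 | 399.500000 | 36.500000 |
| 2980 | 5840 | 12479 | 527.20 | 11.0 | 1.0 | 385.0 | 31.0 | 17.006452 | 4.0 | 0.333333 | 34.0 | 385.000000 | 31.000000 |
| 2981 | 5861 | 14126 | 706.13 | 7.0 | 3.0 | 508.0 | 15.0 | 47.075333 | 3.0 | 1.000000 | 50.0 | 169.333333 | 5.000000 |
| 2982 | 5867 | 13521 | 1093.65 | 1.0 | 3.0 | 736.0 | 436.0 | 2.508372 | 4.5 | 0.300000 | 0.0 | 245.333333 | 145.333333 |
| 2983 | 5877 | 15060 | 303.09 | 8.0 | 4.0 | 263.0 | 121.0 | 2.504876 | 1.0 | 2.000000 | 0.0 | 65.750000 | 30.250000 |
| 2984 | 5896 | 12558 | 269.96 | 7.0 | 1.0 | 196.0 | 11.0 | 24.541818 | 6.0 | 0.285714 | 196.0 | 196.000000 | 11.000000 |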